Identifiability and Unmixing of Latent Parse Trees
Hsu, Daniel, Kakade, Sham M., Liang, Percy
This paper explores unsupervised learning of parsing models along two directions. First, which models are identifiable from infinite data? We use a general technique for numerically checking identifiability based on the rank of a Jacobian matrix, and apply it to several standard constituency and dependency parsing models. Second, for identifiable models, how do we estimate the parameters efficiently? EM suffers from local optima, while recent work using spectral methods [1] cannot be directly applied since the topology of the parse tree varies across sentences. We develop a strategy, unmixing, which deals with this additional complexity for restricted classes of parsing models.
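The Jacobian-rank identifiability check described above can be sketched numerically with finite differences: a model is locally identifiable if the Jacobian of the parameter-to-moments map has full column rank at generic parameters. The toy two-coin moment maps below are illustrative assumptions, not the paper's parsing models.

```python
import numpy as np

def numerical_jacobian(f, theta, eps=1e-6):
    """Finite-difference Jacobian of the moment map f at theta."""
    theta = np.asarray(theta, dtype=float)
    f0 = np.asarray(f(theta))
    J = np.zeros((f0.size, theta.size))
    for i in range(theta.size):
        t = theta.copy()
        t[i] += eps
        J[:, i] = (np.asarray(f(t)) - f0) / eps
    return J

def locally_identifiable(f, dim, trials=5, seed=0):
    """Locally identifiable iff the Jacobian of the moment map
    has full column rank at generic (random) parameters."""
    rng = np.random.default_rng(seed)
    ranks = [np.linalg.matrix_rank(numerical_jacobian(f, rng.uniform(0.1, 0.9, dim)))
             for _ in range(trials)]
    return max(ranks) == dim

# Toy model: two independent coins with biases p, q.
# Observing (X, Y) gives moments E[X]=p, E[Y]=q, E[XY]=pq -> identifiable.
moments_full = lambda th: np.array([th[0], th[1], th[0] * th[1]])
# Observing only the product XY gives E[XY]=pq -> (p, q) not identifiable.
moments_prod = lambda th: np.array([th[0] * th[1]])

print(locally_identifiable(moments_full, 2))  # True
print(locally_identifiable(moments_prod, 2))  # False
```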
Entropy Regularized Optimal Transport Independence Criterion
Liu, Lang, Pal, Soumik, Harchaoui, Zaid
Statistical independence measures have been widely used in machine learning and statistics, ranging from independent component analysis (Bach and Jordan, 2002; Gretton et al., 2005) and causal inference (Pfister et al., 2018; Chakraborty and Zhang, 2019) to, more recently, self-supervised learning (Li et al., 2021) and representation learning (Ozair et al., 2019). Classical dependence measures such as Pearson's correlation coefficient, Spearman's ρ, and Kendall's τ (Hoeffding, 1948; Kruskal, 1958; Lehmann, 1966) focus on one-dimensional real-valued random variables and are therefore not suitable for high-dimensional data; see also (Schweizer and Wolff, 1981; Nikitin, 1995). One popular choice of independence measure in high dimensions is the Hilbert-Schmidt independence criterion (HSIC) (Gretton et al., 2005). This criterion was used to develop an independence test by Gretton et al. (2007b). Several extensions of HSIC are available, such as a relative dependency measure (Bounliphone et al., 2015) and a joint independence measure among multiple random elements (Pfister et al., 2018). Another choice is the distance covariance (dCov) of Székely et al. (2007).
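As a minimal sketch of the HSIC statistic mentioned above, here is the biased empirical estimator (1/n²) tr(KHLH) with Gaussian kernels; the bandwidth and the synthetic data are illustrative assumptions.

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    """Gaussian kernel Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-d2 / (2 * sigma**2))

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC (Gretton et al., 2005): (1/n^2) tr(K H L H)."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    K, L = gaussian_gram(X, sigma), gaussian_gram(Y, sigma)
    return np.trace(K @ H @ L @ H) / n**2

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Y_dep = X + 0.1 * rng.normal(size=(200, 2))   # strongly dependent on X
Y_ind = rng.normal(size=(200, 2))             # independent of X
print(hsic(X, Y_dep), hsic(X, Y_ind))  # the dependent pair scores far higher
```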
- Asia > Middle East > Jordan (0.24)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
Automated Construction of Sparse Bayesian Networks from Unstructured Probabilistic Models and Domain Information
Srinivas, Sampath, Russell, Stuart, Agogino, Alice M.
An algorithm for the automated construction of a sparse Bayesian network from an unstructured probabilistic model and causal domain information supplied by an expert has been developed and implemented. The goal is to obtain a network that explicitly reveals as much information regarding conditional independence as possible. The network is built incrementally, adding one node at a time. The expert's information and a greedy heuristic that tries to keep the number of arcs added at each step to a minimum guide the search for the next node to add. The probabilistic model is a predicate that can answer queries about independencies in the domain. In practice, the model can be implemented in various ways; for example, it could be a statistical independence test operating on empirical data or a deductive prover operating on a set of independence statements about the domain.
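The incremental construction can be sketched as follows. The `independent` oracle stands in for the paper's probabilistic-model predicate, the node ordering is taken as given, and the brute-force search over parent subsets is a simplification of the paper's greedy heuristic.

```python
from itertools import combinations

def build_network(nodes, independent):
    """Incrementally build a DAG: for each node (in the given order),
    pick the smallest parent set P among its predecessors such that the
    node is independent of the remaining predecessors given P.
    `independent(x, ys, zs)` answers: is x independent of ys given zs?"""
    parents, added = {}, []
    for x in nodes:
        chosen = set(added)  # fall back to all predecessors
        for k in range(len(added) + 1):
            found = next((set(P) for P in combinations(added, k)
                          if independent(x, set(added) - set(P), set(P))), None)
            if found is not None:
                chosen = found
                break
        parents[x] = chosen
        added.append(x)
    return parents

# Hypothetical oracle for the chain A -> B -> C, where A ⟂ C | B.
def oracle(x, ys, zs):
    if not ys:
        return True
    return {x} | ys == {"A", "C"} and zs == {"B"}

print(build_network(["A", "B", "C"], oracle))
# {'A': set(), 'B': {'A'}, 'C': {'B'}}
```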
- North America > United States > California > Alameda County > Berkeley (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- North America > United States > California > San Mateo County > San Mateo (0.04)
- North America > United States > California > San Mateo County > Menlo Park (0.04)
An Algorithm for Deciding if a Set of Observed Independencies Has a Causal Explanation
Verma, Thomas, Pearl, Judea
In a previous paper [Pearl and Verma, 1991] we presented an algorithm for extracting causal influences from independence information, where a causal influence was defined as the existence of a directed arc in all minimal causal models consistent with the data. In this paper we address the question of deciding whether there exists a causal model that explains ALL the observed dependencies and independencies. Formally, given a list M of conditional independence statements, it is required to decide whether there exists a directed acyclic graph (dag) D that is perfectly consistent with M, namely, every statement in M, and no other, is reflected via d-separation in D. We present and analyze an effective algorithm that tests for the existence of such a dag, and produces one if it exists.
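Testing whether a single statement is reflected via d-separation, the building block such an algorithm relies on, can be sketched with the standard moralized-ancestral-graph criterion (a generic sketch, not the paper's own algorithm).

```python
def d_separated(parents, X, Y, Z):
    """Check X ⟂ Y | Z in a DAG via the moralized ancestral graph:
    restrict to ancestors of X ∪ Y ∪ Z, marry co-parents, drop
    directions, delete Z, and test whether X and Y are disconnected."""
    # 1. ancestors of X ∪ Y ∪ Z (including those nodes themselves)
    anc, stack = set(), list(X | Y | Z)
    while stack:
        v = stack.pop()
        if v not in anc:
            anc.add(v)
            stack.extend(parents.get(v, ()))
    # 2. moralize: undirected child-parent and parent-parent edges
    adj = {v: set() for v in anc}
    for v in anc:
        ps = list(parents.get(v, ()))
        for p in ps:
            adj[v].add(p); adj[p].add(v)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                adj[p].add(q); adj[q].add(p)
    # 3. remove Z and test reachability from X
    reached, stack = set(), [v for v in X if v not in Z]
    while stack:
        v = stack.pop()
        if v in reached or v in Z:
            continue
        reached.add(v)
        stack.extend(adj[v])
    return reached.isdisjoint(Y)

# Collider A -> C <- B: A ⟂ B marginally, but conditioning on C connects them.
dag = {"C": {"A", "B"}}
print(d_separated(dag, {"A"}, {"B"}, set()))  # True
print(d_separated(dag, {"A"}, {"B"}, {"C"}))  # False
```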
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Europe > Germany > Rheinland-Pfalz > Mainz (0.04)
- North America > United States > California > San Mateo County > San Mateo (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
Deriving a Minimal I-map of a Belief Network Relative to a Target Ordering of its Nodes
Matzkevich, Izhar, Abramson, Bruce
This paper identifies and solves a new optimization problem: Given a belief network (BN) and a target ordering on its variables, how can we efficiently derive its minimal I-map whose arcs are consistent with the target ordering? We present three solutions to this problem, all of which lead to directed acyclic graphs based on the original BN's recursive basis relative to the specified ordering (such a DAG is sometimes termed the boundary DAG drawn from the given BN relative to the said ordering [5]). Along the way, we also uncover an important general principle about arc reversals: when reordering a BN according to some target ordering, while attempting to minimize the number of arcs generated, the sequence of arc reversals should follow the topological ordering induced by the original belief network's arcs to as great an extent as possible. These results promise to have a significant impact on the derivation of consensus models, as well as on other algorithms that require the reconfiguration and/or combination of BNs.
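The arc-reversal operation this principle concerns can be sketched in its standard (Shachter-style) form, where both endpoints inherit each other's former parents so the reversed graph remains an I-map; this is a generic illustration, not one of the paper's three solutions.

```python
def reverse_arc(parents, u, v):
    """Reverse arc u -> v in a DAG structure: after the reversal each
    endpoint inherits the other's old parents, keeping the graph an
    I-map of the original distribution. (In real use, u -> v must be
    the only directed path from u to v, or the result is cyclic.)"""
    assert u in parents[v], "no arc u -> v to reverse"
    pu, pv = set(parents[u]), set(parents[v]) - {u}
    parents[u] = pu | pv | {v}   # u gains v and v's other parents
    parents[v] = pu | pv         # v gains u's parents, loses u
    return parents

# Chain A -> B -> C; reverse B -> C.
g = {"A": set(), "B": {"A"}, "C": {"B"}}
reverse_arc(g, "B", "C")
print(sorted(g["B"]), sorted(g["C"]))  # ['A', 'C'] ['A']
```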
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Minnesota (0.04)
- North America > United States > California > San Mateo County > San Mateo (0.04)
- Europe > United Kingdom > England (0.04)
Algorithms for Learning Decomposable Models and Chordal Graphs
de Campos, Luis M., Huete, Juan F.
Decomposable dependency models and their graphical counterparts, i.e., chordal graphs, possess a number of interesting and useful properties. On the basis of two characterizations of decomposable models in terms of independence relationships, we develop an exact algorithm for recovering the chordal graphical representation of any given decomposable model. We also propose an algorithm for learning chordal approximations of dependency models isomorphic to general undirected graphs.
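A standard chordality test along these lines is maximum cardinality search followed by a perfect-elimination check (the Tarjan–Yannakakis procedure); this is a generic sketch rather than the paper's recovery algorithm, with a toy 4-cycle as the example.

```python
def is_chordal(adj):
    """Maximum cardinality search (MCS) produces a vertex ordering;
    the graph is chordal iff each vertex's earlier neighbors, minus
    its latest-ordered one (the 'parent'), are adjacent to that parent."""
    order, weight = [], {v: 0 for v in adj}
    unvisited = set(adj)
    while unvisited:
        v = max(unvisited, key=lambda u: weight[u])  # MCS: most visited neighbors
        order.append(v)
        unvisited.discard(v)
        for w in adj[v]:
            if w in unvisited:
                weight[w] += 1
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        earlier = {w for w in adj[v] if pos[w] < pos[v]}
        if earlier:
            parent = max(earlier, key=lambda w: pos[w])
            if not (earlier - {parent} <= adj[parent]):
                return False
    return True

# The 4-cycle a-b-c-d is not chordal; adding the chord a-c makes it chordal.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
print(is_chordal(cycle))  # False
chorded = {v: set(n) for v, n in cycle.items()}
chorded["a"].add("c"); chorded["c"].add("a")
print(is_chordal(chorded))  # True
```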
- North America > United States > New York (0.04)
- Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Learning Inclusion-Optimal Chordal Graphs
Auvray, Vincent, Wehenkel, Louis
Chordal graphs can be used to encode dependency models that are representable by both directed acyclic and undirected graphs. This paper discusses a very simple and efficient algorithm to learn the chordal structure of a probabilistic model from data. The algorithm is a greedy hill-climbing search algorithm that uses the inclusion boundary neighborhood over chordal graphs. In the limit of a large sample size and under appropriate hypotheses on the scoring criterion, we prove that the algorithm will find a structure that is inclusion-optimal when the dependency model of the data-generating distribution can be represented exactly by an undirected graph. The algorithm is evaluated on simulated datasets.
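The greedy hill-climbing skeleton can be sketched generically; the toy integer search space below stands in for chordal graphs with the inclusion boundary neighborhood, and the quadratic score for a real decomposable scoring criterion.

```python
def hill_climb(initial, neighbors, score):
    """Greedy hill-climbing: repeatedly move to the best-scoring
    neighbor until no neighbor improves on the current score."""
    current, best = initial, score(initial)
    while True:
        improved = False
        for cand in neighbors(current):
            s = score(cand)
            if s > best:
                current, best, improved = cand, s, True
        if not improved:
            return current, best

# Toy stand-in: "structures" are integers, the neighborhood is x ± 1,
# and the score peaks at 7. A real run would search chordal graphs
# via the inclusion boundary neighborhood with a score such as BDeu.
result, best = hill_climb(0, lambda x: (x - 1, x + 1), lambda x: -(x - 7) ** 2)
print(result, best)  # 7 0
```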
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Belgium (0.04)